Contrastive Feature Induction for Efficient Structure Learning of Conditional Random Fields

Authors

  • Ni Lao
  • Jun Zhu
Abstract

Structure learning of Conditional Random Fields (CRFs) can be cast as an L1-regularized optimization problem, which can be solved by efficient gradient-based optimization methods. However, optimizing a fully linked model may require inference on dense graphs, which can be prohibitively slow and inaccurate. Gain-based or gradient-based feature selection methods avoid this problem by starting from an empty model and incrementally adding top-ranked features to it. However, training time with these incremental methods can be dominated by the cost of evaluating the gain or gradient of candidate features for high-dimensional problems. In this study we propose a fast feature evaluation algorithm called Contrastive Feature Induction (CFI), based on Mean Field Contrastive Divergence (CD_MF). CFI only evaluates the subset of features that involve variables with high signals (deviation from the mean) or errors (prediction residue). We prove that the gradient of candidate features can be represented solely as a function of signals and errors, and that CFI is an efficient approximation of gradient-based evaluation methods. Experiments on synthetic and real datasets show competitive learning speed and accuracy of CFI on pairwise CRFs, compared to state-of-the-art structure learning methods such as full optimization over all features and Grafting. More interestingly, CFI is not only faster than other methods but also produces models with higher prediction accuracy, by focusing on large prediction errors during induction.
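As a rough illustration of the signal/error idea, the following sketch ranks absent pairwise features by combining one endpoint's deviation from its empirical mean with the other endpoint's prediction residue. All names and the exact scoring formula here are assumptions for illustration, not the paper's actual CFI derivation.

```python
# A minimal sketch of CFI-style candidate scoring for pairwise features
# (illustrative only; the scoring formula is an assumption, not the paper's).

def cfi_candidate_scores(x, mu, q, top_k=2):
    """Rank candidate pairwise features (i, j) for one training instance.

    x  : observed assignment of each variable (0/1)
    mu : empirical mean of each variable over the training data
    q  : mean-field marginals predicted by the current model
    """
    n = len(x)
    signal = [x[i] - mu[i] for i in range(n)]  # deviation from the mean
    error = [x[i] - q[i] for i in range(n)]    # prediction residue
    # Approximate the gradient magnitude of an absent pairwise feature by
    # combining one endpoint's signal with the other's error, so that only
    # pairs touching a high-signal or high-error variable score highly.
    scores = {}
    for i in range(n):
        for j in range(i + 1, n):
            scores[(i, j)] = abs(signal[i] * error[j] + signal[j] * error[i])
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

In an induction loop, one would score candidates over (mini-batches of) instances, add the top-ranked features to the model, re-fit the weights, and repeat; pairs whose endpoints all have near-zero signal and error can be skipped entirely, which is the source of the speedup the abstract describes.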


Similar resources

Learning with Blocks: Composite Likelihood and Contrastive Divergence

Composite likelihood methods provide a wide spectrum of computationally efficient techniques for statistical tasks such as parameter estimation and model selection. In this paper, we present a formal connection between the optimization of composite likelihoods and the well-known contrastive divergence algorithm. In particular, we show that composite likelihoods can be stochastically optimized b...

Full text

Punctuation Prediction using Linear Chain Conditional Random Fields

We investigate the task of punctuation prediction in English sentences without prosodic information. In our approach, stochastic gradient ascent (SGA) is used to maximize log conditional likelihood when learning the parameters of linear-chain conditional random fields. For SGA, two different approximation techniques, namely Collins perceptron and contrastive divergence, are used to estimate the...

Full text

Efficiently Inducing Features of Conditional Random Fields

Conditional Random Fields (CRFs) are undirected graphical models, a special case of which correspond to conditionally-trained finite state machines. A key advantage of CRFs is their great flexibility to include a wide variety of arbitrary, non-independent features of the input. Faced with this freedom, however, an important question remains: what features should be used? This paper presents an ...

Full text

Contrastive Estimation: Training Log-Linear Models on Unlabeled Data

Conditional random fields (Lafferty et al., 2001) are quite effective at sequence labeling tasks like shallow parsing (Sha and Pereira, 2003) and named-entity extraction (McCallum and Li, 2003). CRFs are log-linear, allowing the incorporation of arbitrary features into the model. To train on unlabeled data, we require unsupervised estimation methods for log-linear models; few exist. We describe ...

Full text

Conditional Random Fields for Word Hyphenation

Word hyphenation is an important problem which has many practical applications. The problem is challenging because of the vast number of English words. We use linear-chain Conditional Random Fields (CRFs), which have efficient algorithms for learning and prediction, to predict the hyphenation of English words that do not appear in the training dictionary. In this report, we are interested in finding 1) an efficient optimi...

Full text


Journal:
  • CoRR

Volume: abs/1406.7445  Issue: 

Pages: -

Publication date: 2012